Fix LXD lowering host bridge MTU #11919

Merged (3 commits) on Jul 4, 2023


stgraber (Contributor) commented Jul 4, 2023

With reasonably complex bridges that are not managed by LXD (in my case, a bridge using VLAN filtering), it's not unusual for some VLANs to use a different MTU than others.

The problem is that LXD currently applies the same MTU to both the host-side and guest-side devices.
That would be a reasonable approach, were it not for the fact that the Linux kernel automatically lowers a bridge's MTU whenever a device with an MTU lower than the bridge's current value is added to it.

In practice, this means that starting a single instance with a network interface whose MTU is lower than the bridge's current MTU (1500 vs 9000 in my case) causes the bridge MTU to be lowered, which breaks a whole bunch of other things.
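The lowering described above can be illustrated with a toy model. This is only a sketch of the behavior as reported in this PR, not kernel or LXD code; the `bridge` type and `addPort` method are hypothetical names made up for the illustration.

```go
package main

import "fmt"

// bridge is a toy model of a Linux bridge (hypothetical type, not a kernel API).
type bridge struct {
	name string
	mtu  int
}

// addPort models the behavior described in this PR: attaching a device whose
// MTU is lower than the bridge's current MTU lowers the bridge MTU to match.
func (b *bridge) addPort(portMTU int) {
	if portMTU < b.mtu {
		b.mtu = portMTU
	}
}

func main() {
	br := &bridge{name: "br0", mtu: 9000}

	br.addPort(9000) // a port that matches the bridge MTU changes nothing
	fmt.Println(br.name, "MTU after first port:", br.mtu)

	br.addPort(1500) // host-side device created with the instance's MTU
	fmt.Println(br.name, "MTU after second port:", br.mtu)
}
```

In this model a single port with MTU 1500 is enough to drag the whole bridge down from 9000, which matches the failure reported here.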

Instead, we need to ensure that the host-side device always has the same MTU as the bridge it is being added to, and set the user-requested MTU on the guest-side device only.
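A minimal sketch of that rule follows. It is not the code from this PR: `planMTUs` is a hypothetical helper, and treating a requested MTU of 0 as "not set, inherit the bridge MTU" is an assumption made for the example.

```go
package main

import "fmt"

// planMTUs is a hypothetical helper (not LXD code) that returns the MTU to
// apply to each end of an instance NIC attached to a bridge.
//
// The host-side device always mirrors the bridge MTU so that attaching it
// never lowers the bridge. The guest-side device gets the user-requested MTU.
// Assumption: requestedMTU == 0 means no MTU was requested, in which case the
// guest side inherits the bridge MTU.
func planMTUs(bridgeMTU, requestedMTU int) (hostMTU, guestMTU int) {
	hostMTU = bridgeMTU
	guestMTU = requestedMTU
	if guestMTU == 0 {
		guestMTU = bridgeMTU
	}
	return hostMTU, guestMTU
}

func main() {
	host, guest := planMTUs(9000, 1500)
	fmt.Printf("host-side MTU: %d, guest-side MTU: %d\n", host, guest)
}
```

With a bridge MTU of 9000 and a requested MTU of 1500, only the guest-side device ends up at 1500, so the bridge itself is left alone.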

This does cause a small behavior difference, though. As MTU stands for maximum "transmission" unit, the kernel only enforces it on egress, not on ingress. With the current (buggy) implementation, a container with mtu=1500 will effectively never receive a packet larger than 1500 bytes; the fixed logic technically doesn't prevent that.
With a bridge MTU of 9000 on the host and an interface MTU of 1500 in the container, the container can now receive packets of up to 9000 bytes, but it cannot send a response or any new packet larger than its configured 1500.
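The asymmetry can be sketched as follows. This is a simplified model of the egress-only check described above, not kernel code; the `iface` type and the `canTransmit` and `deliver` functions are hypothetical names.

```go
package main

import "fmt"

// iface is a toy network interface (hypothetical type for illustration).
type iface struct {
	name string
	mtu  int
}

// canTransmit models the egress-only check: a packet may leave an interface
// only if it fits within that interface's MTU.
func (i iface) canTransmit(size int) bool {
	return size <= i.mtu
}

// deliver models a packet crossing from one end of a device pair to the
// other. Only the sender's MTU is consulted; the receiver's MTU is not
// checked, because MTU is not enforced on ingress in this model.
func deliver(from, to iface, size int) bool {
	return from.canTransmit(size)
}

func main() {
	host := iface{name: "host-side", mtu: 9000}
	guest := iface{name: "guest-side", mtu: 1500}

	fmt.Println("inbound 9000 bytes delivered:", deliver(host, guest, 9000))
	fmt.Println("outbound 9000 bytes delivered:", deliver(guest, host, 9000))
	fmt.Println("outbound 1500 bytes delivered:", deliver(guest, host, 1500))
}
```

In this model the inbound 9000-byte packet gets through while the outbound one does not, which is the behavior difference noted above.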

This certainly isn't ideal, but the current behavior is also a grey area: the bridge can be forced back to its full MTU of 9000 while still containing some devices with an MTU of 1500. This most likely results in packets being dropped somewhere in the kernel, rather than being truncated to match the MTU as they would be in the physical version of this setup.

stgraber added 2 commits July 3, 2023 23:33
Signed-off-by: Stéphane Graber <[email protected]>
Signed-off-by: Stéphane Graber <[email protected]>
stgraber (Contributor, Author) commented Jul 4, 2023

This issue is why I keep getting downtime on my production cluster during cluster restores.
The MTU of br0 gets lowered to 1500 when one of my containers starts, which then breaks the OVN uplink and some other critical networks (BGP) until I manually reset the bridge MTU back to 9000.

@tomponline tomponline merged commit 8d9f3c2 into canonical:master Jul 4, 2023
tomponline added a commit to canonical/lxd-pkg-snap that referenced this pull request Jul 7, 2023
@stgraber stgraber deleted the master branch July 19, 2023 13:54